{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Chaining and Teeing Iterators"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Often we need to chain iterators/iterables together to behave like a single iterable.\n",
    "\n",
    "We can think of this as analogous to sequence concatenation.\n",
    "\n",
    "For example, suppose we have some generators producing squares:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we want to essentially iterate through all the values as if they were a single iterator.\n",
    "\n",
    "We could do it this way:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "for gen in (l1, l2, l3):\n",
    "    for item in gen:\n",
    "        print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In fact, we could even create our own generator function to do this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def chain_iterables(*iterables):\n",
    "    for iterable in iterables:\n",
    "        yield from iterable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))\n",
    "\n",
    "for item in chain_iterables(l1, l2, l3):\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But, a much simpler way is to use `chain` in the `itertools` module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from itertools import chain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))\n",
    "\n",
    "for item in chain(l1, l2, l3):\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that `chain` expects a variable number of positional arguments, each of which should be an iterable.\n",
    "\n",
    "It will not work if we pass it a single iterable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<generator object <genexpr> at 0x0000020AAFE06F10>\n",
      "<generator object <genexpr> at 0x0000020AAFDB43B8>\n",
      "<generator object <genexpr> at 0x0000020AAFE06FC0>\n"
     ]
    }
   ],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))\n",
    "\n",
    "lists = [l1, l2, l3]\n",
    "for item in chain(lists):\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, it simply took our list and handed it back directly - there was nothing else to chain with:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))\n",
    "\n",
    "lists = [l1, l2, l3]\n",
    "for item in chain(lists):\n",
    "    for i in item:\n",
    "        print(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead, we could use unpacking:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "l1 = (i**2 for i in range(4))\n",
    "l2 = (i**2 for i in range(4, 8))\n",
    "l3 = (i**2 for i in range(8, 12))\n",
    "\n",
    "lists = [l1, l2, l3]\n",
    "for item in chain(*lists):\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Unpacking works with iterables in general, so even the following would work just fine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def squares():\n",
    "    yield (i**2 for i in range(4))\n",
    "    yield (i**2 for i in range(4, 8))\n",
    "    yield (i**2 for i in range(8, 12))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "for item in chain(*squares()):\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But, unpacking is not lazy!! Here's a simple example that shows this, and why we have to be careful using unpacking if we want to preserve lazy evaluation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def squares():\n",
    "    print('yielding 1st item')\n",
    "    yield (i**2 for i in range(4))\n",
    "    print('yielding 2nd item')\n",
    "    yield (i**2 for i in range(4, 8))\n",
    "    print('yielding 3rd item')\n",
    "    yield (i**2 for i in range(8, 12))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def read_values(*args):\n",
    "    print('finised reading args')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "yielding 1st item\n",
      "yielding 2nd item\n",
      "yielding 3rd item\n",
      "finised reading args\n"
     ]
    }
   ],
   "source": [
    "read_values(*squares())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead we can use an alternate \"constructor\" for chain, that takes a single iterable (of iterables) and lazily iterates through the outer iterable as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "c = chain.from_iterable(squares())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "yielding 1st item\n",
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "yielding 2nd item\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "yielding 3rd item\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "for item in c:\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note also, that we can easily reproduce the same behavior of either chain quite easily:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def chain_(*args):\n",
    "    for item in args:\n",
    "        yield from item"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def chain_iter(iterable):\n",
    "    for item in iterable:\n",
    "        yield from item"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can use those just as we saw before with `chain` and `chain.from_iterable`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "yielding 1st item\n",
      "yielding 2nd item\n",
      "yielding 3rd item\n"
     ]
    }
   ],
   "source": [
    "c = chain_(*squares())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "yielding 1st item\n",
      "0\n",
      "1\n",
      "4\n",
      "9\n",
      "yielding 2nd item\n",
      "16\n",
      "25\n",
      "36\n",
      "49\n",
      "yielding 3rd item\n",
      "64\n",
      "81\n",
      "100\n",
      "121\n"
     ]
    }
   ],
   "source": [
    "c = chain_iter(squares())\n",
    "for item in c:\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### \"Copying\" an Iterator"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sometimes we may have an iterator that we want to use multiple times for some reason.\n",
    "\n",
    "As we saw, iterators get exhausted, so simply making multiple references to the same iterator will not work - they will just point to the same iterator object.\n",
    "\n",
    "What we would really like is a way to \"copy\" an iterator and use these copies independently of each other."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use `tee` in `itertools`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from itertools import tee"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def squares(n):\n",
    "    for i in range(n):\n",
    "        yield i**2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<generator object squares at 0x0000020AAFE067D8>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gen = squares(10)\n",
    "gen"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "iters = tee(squares(10), 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<itertools._tee at 0x20aafe6b608>,\n",
       " <itertools._tee at 0x20aafe6bdc8>,\n",
       " <itertools._tee at 0x20aafe6b448>)"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "iters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tuple"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(iters)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see `iters` is a **tuple** contains 3 iterators - let's put them into some variables and see what each one is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "iter1, iter2, iter3 = iters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0, 1, 4)"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(iter1), next(iter1), next(iter1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0, 1)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(iter2), next(iter2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(iter3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, `iter1`, `iter2`, and `iter3` are essentially three independent \"copies\" of our original iterator (`squares(10)`)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this works for any iterable, so even sequence types such as lists:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "l = [1, 2, 3, 4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "lists = tee(l, 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<itertools._tee at 0x20aafe6b688>"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lists[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<itertools._tee at 0x20aafe6b048>"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lists[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But you'll notice that the elements of `lists` are not lists themselves!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(lists[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(lists[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, the elements returned by `tee` are actually `iterators` - even if we used an iterable such as a list to start off with!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lists[1] is lists[1].__iter__()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'__next__' in dir(lists[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Yep, the elements of `lists` are indeed iterators!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
